12 research outputs found
Honey Sheets: What Happens to Leaked Google Spreadsheets?
Cloud-based documents are inherently valuable, due to the volume and nature
of sensitive personal and business content stored in them. Despite the
importance of such documents to Internet users, there are still large gaps in
the understanding of what cybercriminals do when they illicitly get access to
them by, for example, compromising the account credentials they are associated
with. In this paper, we present a system able to monitor user activity on
Google spreadsheets. We populated 5 Google spreadsheets with fake bank account
details and fake funds transfer links. Each spreadsheet was configured to
report details of accesses and clicks on links back to us. To study how people
interact with these spreadsheets in case they are leaked, we posted unique
links pointing to the spreadsheets on a popular paste site. We then monitored
activity in the accounts for 72 days, and observed 165 accesses in total. We
were able to observe interesting modifications made to these spreadsheets
during illicit accesses. For instance, we observed deletion of some fake bank
account information, in addition to insults and warnings that some visitors
entered in some of the spreadsheets. Our preliminary results show that our
system can be used to shed light on cybercriminal behavior with regard to
leaked online documents.
All Your Cards Are Belong To Us: Understanding Online Carding Forums
Underground online forums are platforms that enable trades of illicit
services and stolen goods. Carding forums, in particular, are known for being
focused on trading financial information. However, little evidence exists about
the sellers that are present on carding forums, the precise types of products
they advertise, and the prices buyers pay. Existing literature mainly focuses
on the organisation and structure of the forums. Furthermore, studies on
carding forums are usually based on literature review, expert interviews, or
data from forums that have already been shut down. This paper provides
first-of-its-kind empirical evidence on active forums where stolen financial
data is traded. We monitored 5 out of 25 discovered forums, collected posts
from the forums over a three-month period, and analysed them quantitatively and
qualitatively. We focused our analyses on products, prices, seller prolificacy,
seller specialisation, and seller reputation.
Master of sheets: A tale of compromised cloud documents
As of 2014, a fifth of EU citizens relied on cloud accounts to store their documents according to a Eurostat report. Although useful, there are downsides to the use of cloud documents. They often accumulate sensitive information over time, including financial information. This makes them attractive targets to cybercriminals. To understand what happens to compromised cloud documents that contain financial information, we set up 100 fake payroll sheets comprising 1000 fake records of fictional individuals. We populated the sheets with traditional bank payment information, cryptocurrency details, and payment URLs. To lure cybercriminals and other visitors into visiting the sheets, we leaked links pointing to the sheets via paste sites. We collected data from the sheets for a month, during which we observed 235 accesses across 98 sheets. Two sheets were not opened. We also recorded 38 modifications in 7 sheets. We present detailed measurements and analysis of accesses, modifications, edits, and devices that visited payment URLs in the sheets. Contrary to our expectations, bank payment URLs received many more clicks than cryptocurrency payment URLs despite the popularity of cryptocurrencies and emerging blockchain technologies. On the other hand, sheets that contained cryptocurrency details recorded more modifications than sheets that contained traditional banking information. In summary, we present a comprehensive picture of what happens to compromised cloud spreadsheets.
Email Babel: Does Language Affect Criminal Activity in Compromised Webmail Accounts?
We set out to understand the effects of differing language on the ability of
cybercriminals to navigate webmail accounts and locate sensitive information in
them. To this end, we configured thirty Gmail honeypot accounts with English,
Romanian, and Greek language settings. We populated the accounts with email
messages in those languages by subscribing them to selected online newsletters.
We hid email messages about fake bank accounts in fifteen of the accounts to
mimic real-world webmail users that sometimes store sensitive information in
their accounts. We then leaked credentials to the honey accounts via paste
sites on the Surface Web and the Dark Web, and collected data for fifteen days.
Our statistical analyses on the data show that cybercriminals are more likely
to discover sensitive information (bank account information) in the Greek
accounts than the remaining accounts, contrary to the expectation that Greek
ought to constitute a barrier to the understanding of non-Greek visitors to the
Greek accounts. We also extracted the important words among the emails that
cybercriminals accessed (as an approximation of the keywords that they searched
for within the honey accounts), and found that financial terms featured among
the top words. In summary, we show that language plays a significant role in
the ability of cybercriminals to access sensitive information hidden in
compromised webmail accounts.
Honeypot boulevard: understanding malicious activity via decoy accounts
This thesis describes the development and deployment of honeypot systems to measure real-world cybercriminal activity in online accounts. Compromised accounts expose users to serious threats including information theft and abuse. By analysing the modus operandi of criminals that compromise and abuse online accounts, we aim to provide insights that will be useful in the development of mitigation techniques. We explore account compromise and abuse across multiple online platforms that host webmail, social, and cloud document accounts. First, we design and create realistic decoy accounts (honeypots) and build covert infrastructure to monitor activity in them. Next, we leak credentials of those accounts online to lure miscreants to the accounts. Finally, we record and analyse the resulting activity in the compromised accounts. Our top three findings on what happens after online accounts are attacked can be summarised as follows. First, attackers that know the locations of webmail account owners tend to connect from places that are closer to those locations. Second, we show that demographic attributes of social accounts influence how cybercriminals interact with them. Third, in cloud documents, we show that document content influences the activity of cybercriminals. We have released a tool for setting up webmail honeypots to help other researchers that may be interested in setting up their own honeypots.
The Cause of All Evils: Assessing Causality Between User Actions and Malware Activity
Malware samples are created at a pace that makes it difficult for analysis to keep up. When analyzing an unknown malware sample, it is important to assess its capabilities to determine how much damage it can do to its victims, and to prioritize which threats should be dealt with first. In a corporate environment, for example, a malware infection that is able to steal financial information is much more critical than one that is sending email spam, and should be dealt with as the highest priority. In this paper we present a statistical approach able to determine causality relations between a specific trigger action (e.g., a user visiting a certain website in the browser) and a malware sample. We show that we can learn the typology of a malware sample by presenting it with a number of trigger actions commonly performed by users, and studying to which events the malware reacts. We show that our approach is able to correctly infer causality relations between information stealing malware and login events on websites, as well as between adware and websites containing advertisements.
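One way to picture the statistical idea in this abstract is a permutation test that asks whether a sample is more active in runs where a trigger action was performed than in runs where it was not. This is a minimal sketch and not the paper's method: the choice of test, the function name `permutation_p_value`, and the activity counts are all illustrative assumptions.

```python
import random
from statistics import mean


def permutation_p_value(with_trigger, without_trigger, rounds=2000, seed=0):
    """One-sided permutation test on the difference in mean activity.

    Returns an estimate of how often a random relabelling of the runs
    produces a difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(with_trigger) - mean(without_trigger)
    pooled = list(with_trigger) + list(without_trigger)
    n = len(with_trigger)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = mean(pooled[:n]) - mean(pooled[n:])
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (rounds + 1)


# Hypothetical data: events emitted by the sample in each sandbox run.
triggered = [9, 11, 10, 12, 8, 10]   # runs where the trigger was performed
baseline = [1, 0, 2, 1, 0, 1]        # runs without the trigger
p = permutation_p_value(triggered, baseline)
print(f"estimated p-value: {p:.4f}")
```

A small p-value here would only indicate that activity is associated with the trigger in these made-up runs; a real analysis would repeat this for each trigger action and correct for multiple comparisons.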
SocialHEISTing: understanding stolen Facebook accounts
Online social network (OSN) accounts are often more user-centric
than other types of online accounts (e.g., email accounts)
because they present a number of demographic attributes
such as age, gender, location, and occupation. While
these attributes allow for more meaningful online interactions,
they can also be used by malicious parties to craft various
types of abuse. To understand the effects of demographic
attributes on attacker behavior in stolen social accounts, we
devised a method to instrument and monitor such accounts.
We then created, instrumented, and deployed more than 1000
Facebook accounts, and exposed them to criminals. Our results
confirm that victim demographic traits indeed influence
the way cybercriminals abuse their accounts. For example,
we find that cybercriminals that access teen accounts write
messages and posts more than the ones accessing adult accounts,
and attackers that compromise male accounts perform
disruptive activities such as changing some of their profile
information more than the ones that access female accounts.
This knowledge could potentially help online services develop
new models to characterize benign and malicious activity
across various demographic attributes, and thus automatically
classify future activity.
Diverse Misinformation: Impacts of Human Biases on Detection of Deepfakes on Networks
Social media platforms often assume that users can self-correct against
misinformation. However, social media users are not equally susceptible to all
misinformation as their biases influence what types of misinformation might
thrive and who might be at risk. We call "diverse misinformation" the complex
relationships between human biases and demographics represented in
misinformation. To investigate how users' biases impact their susceptibility
and their ability to correct each other, we analyze classification of deepfakes
as a type of diverse misinformation. We chose deepfakes as a case study for
three reasons: 1) their classification as misinformation is more objective; 2)
we can control the demographics of the personas presented; 3) deepfakes are a
real-world concern with associated harms that must be better understood. Our
paper presents an observational survey (N=2,016) where participants are exposed
to videos and asked questions about their attributes, not knowing some might be
deepfakes. Our analysis investigates the extent to which different users are
duped and which perceived demographics of deepfake personas tend to mislead. We
find that accuracy varies by demographics, and participants are generally
better at classifying videos that match them. We extrapolate from these results
to understand the potential population-level impacts of these biases using a
mathematical model of the interplay between diverse misinformation and crowd
correction. Our model suggests that diverse contacts might provide "herd
correction" where friends can protect each other. Altogether, human biases and
the attributes of misinformation matter greatly, but having a diverse social
group may help reduce susceptibility to misinformation.
Comment: Supplementary appendix available upon request for the time being.
Developing a hierarchical model for unraveling conspiracy theories
A conspiracy theory (CT) suggests covert groups or powerful individuals secretly manipulate events. Not knowing about existing conspiracy theories could make one more likely to believe them, so this work aims to compile a list of CTs shaped as a tree that is as comprehensive as possible. We began with a manually curated ‘tree’ of CTs from academic papers and Wikipedia. Next, we examined 1769 CT-related articles from four fact-checking websites, focusing on their core content, and used a technique called Keyphrase Extraction to label the documents. This process yielded 769 identified conspiracies, each assigned a label and a family name. The second goal of this project was to detect whether an article is a conspiracy theory, so we built a binary classifier with our labeled dataset. This model is based on RoBERTa, a transformer-based model pre-trained on a large corpus, and achieves an F1 score of 87%. This model helps to identify potential conspiracy theories in new articles. We used a combination of clustering (HDBSCAN) and a dimension reduction technique (UMAP) to group the new articles detected as conspiracy theories. We then labeled these groups accordingly to help us match them to the tree. These can lead us to detect new conspiracy theories and expand the tree using computational methods. We successfully generated a tree of conspiracy theories and built a pipeline to detect and categorize conspiracy theories within any text corpus. This pipeline gives us valuable insights from any database formatted as text.
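For reference, the F1 score quoted in this abstract is the harmonic mean of precision and recall for the positive class. The sketch below computes it from true and predicted binary labels; the helper name `f1_score` and the label vectors are made up for illustration and are not the paper's code or data.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score of the positive class for two label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = article is a conspiracy theory, 0 = it is not.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```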
Hives and Honeypots: Understanding Malicious Activity In Online Accounts
3MT presented at the 2017 Defence and Security Doctoral Symposium.
Account credentials are attractive to cybercriminals who often seek ways to monetise the valuable and sensitive data in online accounts that such credentials guard. However, it is unclear what exactly cybercriminals do with compromised accounts after gaining access. To protect users, it is important for researchers and law enforcement agencies to understand the modus operandi of these criminals. To this end, my research focuses on understanding how cybercriminals compromise and abuse online accounts, with a view to providing insights that will be useful in the development of mitigation techniques. I have developed an open-source infrastructure that is capable of monitoring the activity of cybercriminals that connect to webmail accounts. Similarly, I have studied what happens to compromised documents in the cloud. During the 3MT presentation, I plan to present an overview of my work so far, and also provide a brief glimpse into what comes next.